Unification of XML Document Structures for Document Warehouse (DocW)
Authors
Abstract
Data warehouses and OLAP (Online Analytical Processing) technologies analyse huge amounts of structured data that companies store in conventional databases. Recent work underlines the importance of textual data for the decision-making process and has therefore led to the construction of document warehouses. Documents help decision makers better understand the evolution of their business activities. In general, these documents exist in XML format, are geographically distributed, and are described by multiple, differing structures. This paper presents a method for building a distributed document warehouse. The method consists of two steps: i) unification of XML document structures in order to provide a global and generic view of the distributed document warehouse, and ii) multidimensional modeling of the unified documents for decision-support purposes. More specifically, this paper focuses on the unification step.
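The abstract does not describe the unification algorithm itself. As a rough illustration only, the sketch below assumes that unifying structures means taking the union of the element hierarchies found in several XML documents; the function names and sample documents are hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET


def merge_structure(element, node):
    """Merge the tag hierarchy under `element` into the nested dict `node`."""
    for child in element:
        # Reuse the entry for an already-seen tag, or create an empty one.
        merge_structure(child, node.setdefault(child.tag, {}))


def unify_structures(xml_documents):
    """Return a nested dict holding the union of all element hierarchies.

    Assumption: element text and attributes are ignored; only tag
    nesting is considered part of a document's structure.
    """
    unified = {}
    for text in xml_documents:
        root = ET.fromstring(text)
        merge_structure(root, unified.setdefault(root.tag, {}))
    return unified


# Two hypothetical documents with overlapping but different structures.
docs = [
    "<report><title>Q1</title><body><section/></body></report>",
    "<report><title>Q2</title><author/><body><annex/></body></report>",
]
unified = unify_structures(docs)
```

Here `unified` contains a single `report` entry whose children are `title`, `author`, and `body`, with `body` holding both `section` and `annex`. A real unification step would also have to reconcile differently named but equivalent elements, which this sketch does not attempt.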
Similar resources
Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML
As we move into a big data world where unstructured data grows at high velocity, there is a need for data stores that can hold this large amount of data. Traditionally, relational databases have been used, but they are not well suited to handling data at this scale, so a move to non-relational data stores is needed. In the current study, we have proposed an extension of the Mongo...
C-warehousing: a HL7 CDA-based Approach for the Secondary Use of Clinical Data
This paper proposes a semi-automatic approach to extract information stored in a HL7 Clinical Document Architecture (CDA) and transform them to be loaded in a Data Warehouse for secondary purposes. It represents a suitable solution to facilitate the design and implementation of Extract, Transform and Load (ETL) tools that are considered the most time-consuming step of the data warehouse develop...
Conceptual Design of XML Document Warehouses
EXtensible Markup Language (XML) has emerged as the dominant standard in describing and exchanging data among heterogeneous data sources. XML with its self-describing hierarchical structure and its associated XML Schema (XSD) provides the flexibility and the manipulative power needed to accommodate complex, disconnected, heterogeneous data. The issue of large volume of data appearing deserves i...
Conversion of XML Schema to Data Warehouse Schema using Automatic Approach
eXtensible Markup Language (XML) is a data exchange format for representing data in Web-based systems. XML is used by many organizations for e-commerce and internet-based applications such as online shopping, digital libraries, and electronic devices. XML data on its own is not sufficient for analysis on the Web, so industrial organizations need to analyze XML systematically to enable enhan...